Week 06 - Assessing Populations at Risk
Task 3: Assessing Risk with Demographic data
This script continues on 05_NetworkDistancesDK.Rmd and should be run in sequence.
There are also individuals with disabilities, the elderly, or others requiring accessible transport, who do not have access to a car. Getting to these hospital sites may prove difficult in areas outside the isochrones. Within large cities, this problem may be solved with public transport, such as the letbanen in Aarhus. Accessibility may also be less of an issue in areas where car ownership is widespread. We can analyze this additional variable with demographic data, also obtained within R.
Wrangle data from DS
In the next exercise, you will find out which kommunes face the problem of having lots of high-risk households, where there is a no car, and they are outside the isochrones and in rural areas (out of the reach of public transport)
Number of households without cars are available via Danmark Statistik via the “Familiernes bilrådighed (faktiske tal) efter rådighedsmønster, område” dataset. The “familier uden biler” sits in csv format as a no_cars.csv with the numbers corresponding to 2020 and 2021 respectively. Overall, there were 1178935 households without a car in 2020 and 1166668 in 2021. The number of course drops with time, while number of car-owning households go up steadily from 1,896,387 in 2020 to 1,929,511 in 2021
- Load the dataset
- rename the V3 and V4 column to nc2020 and nc2021. (In the provided dataset the numeric columns are number of carless households in year 2020 and in 2021
no_cars <- _________("../data/no_cars.csv", header = FALSE)
names___________ <- c("municipality","nc2020", "nc2021") It would be nice to know the percentage of households without a car Next, get municipalities of Midtjylland and join up with the no_cars attributes, selecting which kommunes have the least cars and are therefore most at risk.
Percentages of inhabitants without a car per municipality
Create a simple feature or spatial polygons dataset that contains information on the percentage of households that do not have access to a car.
- First read in the type and count of households
(“husstande-og-familier”) from 2020 and 2021 from DS
. Beware that the dataset has no headers, and read.csv() might perform
better than tidyverse counterpart. If your computer has trouble with
special characters, specify the file’s encoding as
"windows-1252". - Use
group_by()andsummarize()on the husstande object to get a sum of households per kommune. Rename the resulting columns to municipality and households and label the new objecthh_mun - Join the
hh_muntono_carsobject by the shared municipality column withinner_join(). Look up the function to know how exactly to specify the shared columns. - In the resulting
hh_nocarobject, create two new columns where you calculate the percentage of carfree households for 2020 and 2021, using the total household count and the nc2020 and nc2021 columns.
# Load tidyverse
library(tidyverse)
# load household data
hh <- read.csv2("../data/households2020_2021.csv", header = ______, fileEncoding=___________)
# create a summary of households per municipality
hh_mun <- hh %>%
group_by(_____) %>%
summarize(sum=sum(____)) %>%
rename(municipality = V2, households = sum)
# check total households
sum(hh_mun$households)
# join households to no_car dataset and calculate percentages
hh_nocar <- hh_mun %>% inner_join(no_cars, by = "municipality")
hh_nocar <- hh_nocar %>%
mutate(nc2020pct = _______/households*100,
nc2021pct = _______/households*100)[1] 5456264
[1] 11746840
- Load municipality data either via
getData()or from a saved object from Week03 - Check and reconcile the names of municipalities are consistent
across the
hh_nocarand the newmunicobjects. (Culprits like Aarhus, Vesthimmerlands, and Copenhagen often cause trouble). You need both columns of names to contain the same variants of names else the conflicting ones will be dropped during the following join. - Create
no_cars_pct_munobject by joining the carfree householdshh_nocarobject to the municipalities withinner_join(). Beware to preserve the spatial dimension of themunicfor later use in a map. Look up the function to know how exactly to specify the shared columns. You may also want to rename thepopcolumns first so they are easier to map tomunic
[ Extra: * I have provided an extra file
munic_pop_2022Q1.csv which contains the population by
municipality from the first quarter of 2022 and originates from https://www.statbank.dk/FOLK1A . If you wish to compare
and contrast population data with households by municipality, load it
into a pop object and check it out. This way you can answer
questions such as “How many people on average per household?” ]
# load population (not necessary, I just want to see)
pop <- read.csv2("../data/munic_pop_2022Q1.csv", sep = ",", skip = 2, ______= __________)
names(pop)[1] <- "Name"
names(pop)[2] <- "Pop22"
sum(pop$Pop22)
# load municipalities
________
# reconcile municipalities' and hh_nocars' problematic names
________
# join municipalities with the carless households dataset
no_cars_pct_mun <- munic %>% dplyr::inner_join(hh_nocar, by = c("NAME_2"="municipality")) # this preserves the spatial dimension!- Create a thematic map of
no_cars_pct_munby filtering municipalities that have 25+ percent of households without a car. I recommendmapview(), but you are free to use your favorite library. Mapview usually works well at the end of tidyverse pipeline but as it may interfere with other libraries, I recommend saving the script and upon restart, running only themapview()code chunk so as to avoid interference. - Which areas in Midtjylland have 25 and more percent of carless households?
Now of course, you have the whole situation in Denmark. In the next
steps, you need to constrain the municipalities with carless households
spatially to those in Midtjylland and check the overlap with hospital
isochrones, taking the area outside these isochrones with
st_difference().
- Get the regional spatial polygons from the GADM database (level = 1)
- Convert to simple feature with
as("sf")and filter to get ‘Central Jutland’ only - Use this single region to clip the municipalities data on carless
households
no_cars_pct_mun - Look at the data with the
mapviewfunction, usingnc2021pctfor symbology
Gosh, that was a ton of wrangling! But we are nearly there! :)
Spatial overlay with sf
Spatial overlay is a very common operation when working with spatial data. It can be used to determine which features in one spatial layer overlap with another spatial layer, or extract data from a layer based on geographic information. In R, spatial overlay can be integrated directly into tidyverse-style data analysis pipelines using functions in sf. In our example, we want to determine the areas in Midtjylland with the greatest proportion of households without access to a car that also are beyond a 20 minute walk/cycling route from a hospital.
To do this, follow these steps:
- doublecheck that the coordinate reference system of the
no_car_pct_midtdataset is 4326, the same CRS used by the isochrones; - extract only those municipalities with a percentage of households without cars of 25 percent or above;
- use the
st_difference()function to “cut out” areas from those municipalities that overlap the 20-minute cycling isochrones. (Union the isochrones first for a neater output). - Once you complete this operation, visualize the result in your Mapbox map.
Questions:
3. Are there any areas that are located beyond a 20-minute bike-ride of a hospital that have proportionally lower access to cars? Where are they and what is their total area?
4. How would you ensure residents in these areas have reasonable access to hospitals?
8341.584 [m^2]